Battle of the Neighborhoods - Search for NYC Neighborhood for a Restaurant Business Start-Up

1 Business Problem

Target Audience


Investors who plan to open a new restaurant in the New York City area, who have a strong concept and selection of food to offer but are hesitant to proceed with the investment because they lack an understanding of the neighborhood targeted for the start-up.

Problem to Solve


Identify the neighborhood in which to open the restaurant that gives the business the best chance to succeed. The concern stems from research findings by the US Small Business Administration (SBA) that 30% of new businesses fail within the first 2 years, 50% within 5 years, and 66% within 10 years.

The SBA findings make it necessary for new business investors to include location data analysis in their research when starting a business. Investors need good insight into the neighborhood in which the business will operate. For a new venture such as a restaurant to be viable, an investor has to ensure that the neighborhood is in a high-foot-traffic area close to the city center. Investors also need to look critically at population and crime trends.

To address these concerns, an exploratory data analysis was conducted on all of the New York City boroughs using NYC Open Data containing historical crime, population and housing data. After gaining better insight into the boroughs, segmentation and clustering were conducted on the underlying neighborhoods, using the neighborhoods' social networking data made available by the Foursquare API, to identify the ideal location for the business.

Why Target Audience Would Care


Restaurant business investors will find the results of the analysis valuable, because good insight into the neighborhood gives them the information to anticipate and better manage some of the risks that come with a new business start-up.

2 Data Utilized in the Research

Data Source

New York City Open Data

  • NYPD Complaint Historical Data 2006 to 2017
  • NYC Borough Boundaries (JSON)

New York City Planning

  • NYC Borough Population 1900 to 2010
  • NYC Total Housing Units 1940 to 2010

New York University Spatial Data Repository

  • NYC Borough Neighborhood Coordinates (JSON)

Google Maps API

  • Reverse Geocoding

Foursquare API

  • Neighborhood Venues

Data Wrangling and Transformation Approach

New York City Open Data

New York City Planning

New York University Spatial Data Repository

  • NYC Borough Neighborhood Coordinates (JSON)
    The JSON file had to be downloaded separately, and several missing neighborhoods had to be added to produce an unbiased result.
    Source: https://geo.nyu.edu/catalog/nyu_2451_34572
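
The report does not show how the JSON file was loaded or patched. The following is a minimal sketch of that step, assuming the file is a GeoJSON FeatureCollection of point features whose properties are named "borough" and "name" (both names are assumptions), and using a hypothetical placeholder entry for a manually added neighborhood.

```python
import json

# Tiny stand-in for the downloaded file. The property names "borough" and
# "name" are assumptions; the coordinates are illustrative only.
SAMPLE_GEOJSON = """
{"type": "FeatureCollection", "features": [
  {"type": "Feature",
   "properties": {"borough": "Queens", "name": "Astoria"},
   "geometry": {"type": "Point", "coordinates": [-73.915654, 40.768509]}}
]}
"""

def load_neighborhoods(geojson_text):
    """Flatten GeoJSON point features into plain dict rows."""
    rows = []
    for feature in json.loads(geojson_text)["features"]:
        # GeoJSON stores coordinates as [longitude, latitude].
        lon, lat = feature["geometry"]["coordinates"][:2]
        props = feature["properties"]
        rows.append({"borough": props["borough"],
                     "neighborhood": props["name"],
                     "latitude": lat,
                     "longitude": lon})
    return rows

def add_missing(rows, extra_rows):
    """Append manually supplied neighborhoods that are not already present."""
    known = {(r["borough"], r["neighborhood"]) for r in rows}
    merged = list(rows)
    for row in extra_rows:
        key = (row["borough"], row["neighborhood"])
        if key not in known:
            merged.append(row)
            known.add(key)
    return merged

# Hypothetical manual addition; name and coordinates are placeholders.
EXTRA = [{"borough": "Queens", "neighborhood": "Example Neighborhood",
          "latitude": 40.70, "longitude": -73.80}]

neighborhoods = add_missing(load_neighborhoods(SAMPLE_GEOJSON), EXTRA)
```

Keeping the manual additions in a separate list makes it easy to review exactly which neighborhoods were not part of the original file.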

Google Maps API

A Google Cloud account was opened in order to perform reverse geocoding. The NYPD crime data provides only the latitude and longitude coordinates of each reported crime, whereas the analysis requires neighborhood-level crime data to be incorporated into the investors' decision making.
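
The report does not include its reverse geocoding code. As a rough sketch, assuming the Google Geocoding web service is called with a `latlng` parameter and that the neighborhood is read from an address component typed "neighborhood", the lookup could look like this. The sample response is a hand-written stand-in with only the fields used here, and the API key is a placeholder.

```python
from urllib.parse import urlencode

GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"

def build_reverse_geocode_url(lat, lng, api_key):
    """Build the request URL for one crime coordinate."""
    query = urlencode({"latlng": f"{lat},{lng}", "key": api_key})
    return f"{GEOCODE_URL}?{query}"

def extract_neighborhood(response):
    """Return the first 'neighborhood' component's long name, or None."""
    for result in response.get("results", []):
        for component in result.get("address_components", []):
            if "neighborhood" in component.get("types", []):
                return component["long_name"]
    return None

# Hand-written stand-in for a decoded JSON response (illustrative values).
SAMPLE_RESPONSE = {
    "status": "OK",
    "results": [{
        "address_components": [
            {"long_name": "Long Island City",
             "short_name": "LIC",
             "types": ["neighborhood", "political"]},
            {"long_name": "Queens",
             "short_name": "Queens",
             "types": ["political", "sublocality", "sublocality_level_1"]},
        ]
    }],
}

url = build_reverse_geocode_url(40.75, -73.94, "YOUR_API_KEY")
neighborhood = extract_neighborhood(SAMPLE_RESPONSE)
```

Returning `None` when no neighborhood component is present lets the caller decide whether to drop the record or fall back to another field.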

Foursquare API

A Foursquare developer account was opened in order to use its API, which provides social networking location information about venues, users, and check-ins. The venue data returned by the API enables k-means clustering of New York City's underlying neighborhoods.
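
The exact request used is not shown in the report. The sketch below assumes the legacy Foursquare v2 `venues/explore` endpoint and its usual response layout (`response.groups[].items[].venue`). The credentials and version date are placeholders, while the radius of 500 meters and limit of 100 venues are the values stated later in this report.

```python
from urllib.parse import urlencode

# Assumed endpoint: the legacy v2 "explore" call.
EXPLORE_URL = "https://api.foursquare.com/v2/venues/explore"

def build_explore_url(lat, lng, client_id, client_secret,
                      version="20180605", radius=500, limit=100):
    """Build a venue search URL around one neighborhood coordinate."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,            # placeholder API version date
        "ll": f"{lat},{lng}",
        "radius": radius,        # meters
        "limit": limit,          # maximum venues returned
    }
    return f"{EXPLORE_URL}?{urlencode(params)}"

def extract_venues(payload):
    """Return (venue name, first category name) pairs from a decoded response."""
    pairs = []
    for group in payload.get("response", {}).get("groups", []):
        for item in group.get("items", []):
            venue = item["venue"]
            categories = venue.get("categories") or [{}]
            pairs.append((venue["name"], categories[0].get("name")))
    return pairs

# Hand-written stand-in for a decoded response (hypothetical venue).
SAMPLE_PAYLOAD = {"response": {"groups": [{"items": [
    {"venue": {"name": "Example Pizzeria",
               "categories": [{"name": "Pizza Place"}]}},
]}]}}

explore_url = build_explore_url(40.768509, -73.915654, "CLIENT_ID", "CLIENT_SECRET")
venues = extract_venues(SAMPLE_PAYLOAD)
```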

3 Methodology

The target location for the new restaurant business venture is the New York City area. An exploratory analysis of the area was conducted in order to identify the neighborhoods that give the business the best chance of being viable. Crime, housing and population trends were explored. Once a specific borough had been identified, one-hot encoding was performed on the venues located in its underlying neighborhoods so that the k-means clustering algorithm could be employed to identify similarities among neighborhood venues.
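
To illustrate the one-hot encoding step, here is a dependency-free sketch. Each venue becomes a 0/1 vector over the venue categories, and the vectors are then averaged per neighborhood to give a category-frequency profile. The averaging step is an assumption about how the encoded rows were prepared for clustering, and the venue list is illustrative.

```python
from collections import defaultdict

def one_hot_profiles(venues):
    """Turn (neighborhood, category) pairs into per-neighborhood frequency vectors.

    Returns the sorted category list and a dict mapping each neighborhood
    to the mean of its venues' one-hot rows.
    """
    categories = sorted({category for _, category in venues})
    column = {category: i for i, category in enumerate(categories)}

    sums = defaultdict(lambda: [0] * len(categories))
    counts = defaultdict(int)
    for neighborhood, category in venues:
        # Adding a one-hot row is the same as incrementing one column.
        sums[neighborhood][column[category]] += 1
        counts[neighborhood] += 1

    profiles = {
        neighborhood: [value / counts[neighborhood] for value in row]
        for neighborhood, row in sums.items()
    }
    return categories, profiles

# Illustrative venue list, not the report's data.
SAMPLE_VENUES = [
    ("Astoria", "Bar"),
    ("Astoria", "Bar"),
    ("Astoria", "Bakery"),
    ("Astoria", "Greek Restaurant"),
    ("Woodside", "Bakery"),
    ("Woodside", "Thai Restaurant"),
]

categories, profiles = one_hot_profiles(SAMPLE_VENUES)
```

Each profile sums to 1, so neighborhoods with very different venue counts remain comparable when clustered.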

As noted in the data section, because of space limitations and source websites employing URL redirection, most of the required data was downloaded to a personal computer, where data wrangling and transformation were conducted. The enhanced and transformed data sets were then uploaded to the Cognitive Class Labs environment for the analysis.

[Figure: New York City Boroughs 2017 Reported Crime Map]

[Figure: New York City Historical Population Covering Period 1900 to 2010]

[Figure: New York City Historical Housing Units Covering Period 1940 to 2010]

[Figure: New York City Historical Reported Crime Covering Period 2006 to 2017]

[Figure: New York City 2017 Reported Crime Type Comparison]

4 Result

Data for all New York City boroughs was taken into account in reaching a well-considered decision. Given the data available, this process identifies the Queens borough of New York City as the preferred area for the new business venture.

Further exploratory analysis was conducted on the 2017 reported crime data for the neighborhoods of Queens. Note that in this section only a random 10% sample of the overall 2017 data was used, because of the time it takes for reverse geocoding to identify the neighborhood in which each crime took place and the cost of making extensive calls to the Google Maps API.
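
A 10% random sample of this kind can be drawn reproducibly with a fixed seed. The sketch below is an assumption about the mechanics only; the function name, seed, and stand-in records are hypothetical, and the report does not say how its sample was drawn.

```python
import random

def sample_fraction(records, fraction=0.10, seed=42):
    """Return a reproducible simple random sample of the given fraction."""
    rng = random.Random(seed)                  # fixed seed for repeatability
    size = round(len(records) * fraction)
    return rng.sample(records, size)

# Stand-in for the 2017 complaint records (here just record ids).
all_records = list(range(1000))
sampled = sample_fraction(all_records)
```

Only the sampled records then need to be sent for reverse geocoding, which cuts the number of API calls to roughly a tenth.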

The results of the exploratory analysis and of the k-means clustering applied to Queens neighborhood venues obtained through the Foursquare API are also presented in this section. As noted, the crime-related results are based on a sample, which likely affects them. Additional views of the resulting data are presented in the subsequent section.

[Figure: Queens Borough 2017 NYPD Reported Crime Map - Utilizing Crime Geographic Coordinates]

New York City neighborhood geographic boundaries are sourced from the New York University Spatial Data Repository (https://geo.nyu.edu/catalog/nyu_2451_34572), and the k-means clustering algorithm was applied to these neighborhoods using Foursquare API venue data with a limit of 100 venues and a radius of 500 meters.
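
The report does not show its clustering code or state the number of clusters used. As an illustration of the technique only, here is a small standard-library implementation of k-means (Lloyd's algorithm) run on made-up venue-frequency vectors. The value of `k`, the seed, and the sample vectors are all hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, max_iter=100, seed=0):
    """Cluster points into k groups; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]  # random initial centers
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:       # assignments stable, so converged
            break
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Made-up two-category frequency vectors for four neighborhoods.
SAMPLE_PROFILES = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels, centroids = kmeans(SAMPLE_PROFILES, k=2)
```

The resulting labels correspond to the "Cluster Labels" column in the tables of the Discussion section, where neighborhoods sharing a label have similar venue mixes.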

[Figure: Queens Borough Neighborhood]

[Figure: K-Means Clustering Applied to Queens Borough Neighborhoods Using Their Venues]

5 Discussion

Combined, the different aspects of the neighborhoods explored in the analysis give business investors additional views with which to pinpoint the optimal location for the restaurant start-up according to their specific preferences.

Top 10 Queens Neighborhoods Closest to New York City Center With Lowest Crime

              
Index | Neighborhood | Cluster Labels | Offense | Distance | Venue | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue
72 | Hunters Point | 4 | 0.0 | 1.950069 | 75 | Italian Restaurant | Café | Japanese Restaurant | Brewery | Comedy Club
80 | Queensbridge | 0 | 0.0 | 2.789271 | 17 | Hotel | Sandwich Place | Athletics & Sports | Ramen Restaurant | Baseball Field
10 | Long Island City | 0 | 616.0 | 2.849199 | 68 | Hotel | Coffee Shop | Pizza Place | Mexican Restaurant | Café
74 | Blissville | 0 | 0.0 | 2.905480 | 20 | Hotel | Donut Shop | Deli / Bodega | Bus Station | Clothing Store
11 | Sunnyside | 0 | 46.0 | 3.226495 | 42 | Pizza Place | Italian Restaurant | Chinese Restaurant | Discount Store | Coffee Shop
57 | Ravenswood | 1 | 0.0 | 3.610710 | 26 | Grocery Store | Spanish Restaurant | Liquor Store | Chinese Restaurant | Latin American Restaurant
73 | Sunnyside Gardens | 2 | 0.0 | 3.760672 | 100 | Bar | American Restaurant | Grocery Store | Coffee Shop | Pizza Place
0 | Astoria | 0 | 318.0 | 4.563659 | 100 | Middle Eastern Restaurant | Bar | Greek Restaurant | Hookah Bar | Bakery
1 | Woodside | 0 | 149.0 | 4.603483 | 74 | Grocery Store | Bakery | Thai Restaurant | Pub | Pizza Place
14 | Ridgewood | 1 | 154.0 | 4.761492 | 42 | Grocery Store | Bank | Restaurant | Italian Restaurant | Pizza Place

Top 10 Queens Neighborhoods With Most Venues and Lowest Crime

              
Index | Neighborhood | Cluster Labels | Offense | Distance | Venue | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue
73 | Sunnyside Gardens | 2 | 0.0 | 3.760672 | 100 | Bar | American Restaurant | Grocery Store | Coffee Shop | Pizza Place
83 | Ditmars Steinway | 1 | 30.0 | 5.158936 | 100 | Deli / Bodega | Greek Restaurant | Bakery | Italian Restaurant | Bagel Shop
0 | Astoria | 0 | 318.0 | 4.563659 | 100 | Middle Eastern Restaurant | Bar | Greek Restaurant | Hookah Bar | Bakery
2 | Jackson Heights | 0 | 263.0 | 5.666396 | 84 | Latin American Restaurant | Mobile Phone Shop | South American Restaurant | Peruvian Restaurant | Mexican Restaurant
72 | Hunters Point | 4 | 0.0 | 1.950069 | 75 | Italian Restaurant | Café | Japanese Restaurant | Brewery | Comedy Club
22 | Bayside | 0 | 50.0 | 11.432189 | 74 | Bar | Pizza Place | Spa | American Restaurant | Donut Shop
1 | Woodside | 0 | 149.0 | 4.603483 | 74 | Grocery Store | Bakery | Thai Restaurant | Pub | Pizza Place
10 | Long Island City | 0 | 616.0 | 2.849199 | 68 | Hotel | Coffee Shop | Pizza Place | Mexican Restaurant | Café
9 | Flushing | 0 | 2244.0 | 8.477245 | 67 | Chinese Restaurant | Bubble Tea Shop | Korean Restaurant | Karaoke Bar | Hotpot Restaurant
49 | Rockaway Beach | 0 | 13.0 | 13.392816 | 47 | Beach | Ice Cream Shop | Brazilian Restaurant | Latin American Restaurant | Seafood Restaurant

Top 10 Queens Neighborhoods With Lowest Crime and Most Venues

              
Index | Neighborhood | Cluster Labels | Offense | Distance | Venue | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue
73 | Sunnyside Gardens | 2 | 0.0 | 3.760672 | 100 | Bar | American Restaurant | Grocery Store | Coffee Shop | Pizza Place
72 | Hunters Point | 4 | 0.0 | 1.950069 | 75 | Italian Restaurant | Café | Japanese Restaurant | Brewery | Comedy Club
31 | Jamaica Center | 0 | 0.0 | 10.150977 | 40 | Mobile Phone Shop | Caribbean Restaurant | Clothing Store | Donut Shop | Coffee Shop
44 | Steinway | 0 | 0.0 | 5.431322 | 29 | Deli / Bodega | Rental Car Location | Cosmetics Shop | Sushi Restaurant | Women's Store
60 | Lefrak City | 0 | 0.0 | 6.551668 | 29 | Department Store | Cosmetics Shop | Bakery | Deli / Bodega | Fried Chicken Joint
57 | Ravenswood | 1 | 0.0 | 3.610710 | 26 | Grocery Store | Spanish Restaurant | Liquor Store | Chinese Restaurant | Latin American Restaurant
55 | Queensboro Hill | 0 | 0.0 | 8.521099 | 25 | Chinese Restaurant | Bank | Bakery | Pizza Place | Asian Restaurant
67 | Forest Hills Gardens | 0 | 0.0 | 7.753192 | 22 | Bakery | Grocery Store | Playground | Park | New American Restaurant
74 | Blissville | 0 | 0.0 | 2.905480 | 20 | Hotel | Donut Shop | Deli / Bodega | Bus Station | Clothing Store
80 | Queensbridge | 0 | 0.0 | 2.789271 | 17 | Hotel | Sandwich Place | Athletics & Sports | Ramen Restaurant | Baseball Field
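
The report does not state how the three rankings above were produced. The row orders appear consistent with sorting by distance ascending for the first table, by venue count descending and then offense count ascending for the second, and by offense count ascending and then venue count descending for the third. The sketch below reproduces those orderings on five rows taken from the tables; the sort keys themselves are an inference, not the report's documented method.

```python
# Five rows copied from the tables above (a subset, for illustration).
ROWS = [
    {"neighborhood": "Sunnyside Gardens", "offense": 0.0, "distance": 3.760672, "venues": 100},
    {"neighborhood": "Hunters Point", "offense": 0.0, "distance": 1.950069, "venues": 75},
    {"neighborhood": "Astoria", "offense": 318.0, "distance": 4.563659, "venues": 100},
    {"neighborhood": "Long Island City", "offense": 616.0, "distance": 2.849199, "venues": 68},
    {"neighborhood": "Ditmars Steinway", "offense": 30.0, "distance": 5.158936, "venues": 100},
]

def top_n(rows, key, n=10):
    """Return the names of the first n rows under the given sort key."""
    return [row["neighborhood"] for row in sorted(rows, key=key)[:n]]

# Inferred ordering of the first table: nearest to the city center first.
closest = top_n(ROWS, key=lambda r: r["distance"])

# Inferred ordering of the second table: most venues, ties broken by fewer offenses.
most_venues = top_n(ROWS, key=lambda r: (-r["venues"], r["offense"]))

# Inferred ordering of the third table: fewest offenses, ties broken by more venues.
lowest_crime = top_n(ROWS, key=lambda r: (r["offense"], -r["venues"]))
```

Swapping the order of the tuple elements is what distinguishes the second view from the third, so investors can reweight the criteria by changing only the sort key.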

6 Conclusion

The process can be tailored to the business investors' preferences, and it provides insight into the chosen New York City borough and its underlying neighborhoods. Whether the results give the investors the comfort level to push through with the investment or lead them to reevaluate and hold back on some area preferences, the analysis enables them to manage some of the major risks and concerns that come with a new business start-up.

There were gaps in the data sources that forced the process to employ reverse geocoding using the Google Maps API, to sample the data, and to update the JSON spatial data manually. A more reliable result could have been obtained had these gaps been addressed.